Text Mining for Open Domain Semi-Supervised Semantic Role Labeling

Authors

  • Quynh Ngoc Thi Do
  • Steven Bethard
  • Marie-Francine Moens
Abstract

Identifying and classifying circumstance semantic roles such as Location, Time, Manner and Direction, a subtask of Semantic Role Labeling (SRL), is important for building text understanding applications. However, current SRL systems often perform poorly on these roles, especially when they are applied to domains other than the one they were trained on. We present a method for building an open-domain SRL system in which the training data is expanded by replacing its predicates with words from the testing domain. A language model, treated here as a text mining technique, and several linguistic resources are used to select the best replacement words from the vocabulary of the testing domain. We apply the method to a case study that transfers a semantic role labeler trained on the news domain to the children's story domain. It yields valuable improvements on the four circumstance semantic roles Location, Time, Manner and Direction.
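
The predicate-replacement idea in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which language model is used, so an add-one-smoothed bigram model stands in for it; the filtering by linguistic resources is omitted; and the names `train_bigram_lm`, `log_prob`, and `expand_example` are hypothetical.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Count unigrams and bigrams over tokenized target-domain sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Add-one-smoothed bigram log-probability of a token sequence."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, word in zip(padded, padded[1:]):
        total += math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))
    return total


def expand_example(tokens, labels, predicate_index, candidates, lm, top_k=2):
    """Create new training examples by swapping the predicate for target-domain words.

    Assumption: the role labels are copied unchanged to the new sentence,
    since only the predicate token is replaced.
    """
    unigrams, bigrams = lm
    scored = []
    for cand in candidates:
        if cand == tokens[predicate_index]:
            continue  # skip the original predicate
        new_tokens = tokens[:predicate_index] + [cand] + tokens[predicate_index + 1:]
        scored.append((log_prob(new_tokens, unigrams, bigrams), new_tokens))
    # Keep the replacements the target-domain language model finds most plausible.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(new_tokens, labels) for _, new_tokens in scored[:top_k]]
```

In this sketch the language model is trained on unlabeled text from the testing domain, and each labeled source-domain sentence contributes up to `top_k` additional training sentences. A real system would presumably also restrict the candidates to words that can act as predicates, which is where the linguistic resources mentioned in the abstract would come in.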

Similar resources

A New Method for Improving Computational Cost of Open Information Extraction Systems Using Log-Linear Model

Information extraction (IE) is the process of automatically producing a structured representation from unstructured or semi-structured text. It is a long-standing challenge in natural language processing (NLP) that has been intensified by the increasing volume, heterogeneity, and unstructured form of information. One of the core information extraction tasks is relation extraction wh...

Semi-Supervised Semantic Role Labeling

Large scale annotated corpora are prerequisite to developing high-performance semantic role labeling systems. Unfortunately, such corpora are expensive to produce, limited in size, and may not be representative. Our work aims to reduce the annotation effort involved in creating resources for semantic role labeling via semi-supervised learning. Our algorithm augments a small number of manually l...

Open-Domain Semantic Role Labeling by Modeling Word Spans

Most supervised language processing systems show a significant drop-off in performance when they are tested on text that comes from a domain significantly different from the domain of the training data. Semantic role labeling techniques are typically trained on newswire text, and in tests their performance on fiction is as much as 19% worse than their performance on newswire text. We investigat...

Domain Specific Automatic Question Generation from Text

The goal of my doctoral thesis is to automatically generate interrogative sentences from descriptive sentences in Turkish biology text. We employ syntactic and semantic approaches to parse descriptive sentences; these rely on syntactic (constituent or dependency) parsing and semantic role labeling systems, respectively. After the parsing step, question statements whose an...

Semi-Supervised Semantic Role Labeling via Structural Alignment

Large-scale annotated corpora are a prerequisite to developing high-performance semantic role labeling systems. Unfortunately, such corpora are expensive to produce, limited in size, and may not be representative. Our work aims to reduce the annotation effort involved in creating resources for semantic role labeling via semi-supervised learning. The key idea of our approach is to find novel ins...

Publication date: 2014